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ABSTRACT 


A  posteriori  forward  error  analysis  is  applied  to  the  Gaussian 

elimination  method  for  solving  system  of  linear  algebraic  equations  of 

the  type  Az  =  p.  By  attributing  the  generated  round-off  errors  properly 

to  the  matrices  A  and  p,  it  is  shown  that  the  computed  z  satisfies  a  new 

perturbed  system  such  that  (A  +  6A)z  =  p  +  6p.  For  large  system  order 

n,  the  upper  bounds  for  6A  and  6p  in  infinite  norm  are  then  shown  to  be 
2  3 

proportional  to  n  ,  instead  of  n  obtained  by  the  usual  backward  error 
analysis  where  round-off  errors  are  attributed  totally  to  the  system 
matrix  A.  This  answers  partially  some  questions  raised  concerning  the 
discrepancy  between  the  theoretical  result  and  practical  observation  of 
the  perturbations. 


m 


1.  Introduction. 

Consider  a  system  of  n  linear  algebraic  equations  in  n  unknowns, 
written  as  Az  =  p  where  A  is  a  square  coefficient  matrix  of  order  n, 
whose  elements  are  real  numbers  a^j  with  a  determinant  det(A)  f  0;  z  and 
p  are  column  vectors,  and  the  components  of  p  are  given  real  numbers. 

It  is  desired  to  find  the  unique  solution  z.  Among  the  classes  of  direct 
methods  in  solving  the  system  Az  =  p,  the  most  popular  one  is  perhaps  the 
class  of  methods  based  on  Gauss's  idea  of  a  systematic  elimination  of 
variables.  The  usual  approach  of  the  Gaussian  elimination  methods  con¬ 
sists  of  the  following  steps:  first,  forward  elimination  with  pivoting 
is  used  to  decompose  A  into  two  factors  L  and  U  such  that  LU  =  A  where  L 
is  a  lower  triangular  matrix  and  U  is  an  upper  triangular  matrix;  secondly, 
substitution  is  then  used  to  solve  the  decomposed  system  LUz  =  p  in  the 
sequence  Lv  =  p  and  Uz  =  v. 

The  backward  error  analvsis  of  this  class  of  methods  [1,2]  shows  that 
the  computed  z  satisfies  a  perturbed  system  such  that  (A  +  AA)z  =  p.  For 
large  system  order  n,  the  upper  bound  for  AA  in  infinite  norm  is  propor¬ 
tional  to  n'*.  This  is  mainly  due  to  the  multiplicative  accumulation  of 
perturbations  attributed  to  the  matrices  I.  and  U  in  solving  the  triangular 
systems. 

By  attributing  the  generated  round-off  errors  properly  to  both  A  and 
p,  a  posteriori  forward  error  analysis  [3]  is  carried  out  in  this  paper 
to  analyze  the  Gaussian  elimination  method.  The  results  show  that  in 
solving  the  triangular  systems  the  accumulation  of  perturbations  is 
additive  instead  of  multiplicative.  It  is  also  shown  that  the  computed 


z  satisfies  a  new  perturbed  system  such  that  (A  +  6A)z  =  p  +  6p  where 
the  upper  bounds  for  SA  and  6p  are  proportional  to  n  for  large  n.  This 
result  is  then  used  to  explain  some  inconsistent  interpretation:'  of  the 
results  of  backward  error  analysis. 

2.  Some  basic  lermnas. 

Throughout  this  paper  the  infinite  norm  of  a  vector  x  is  used  as  our 
vector  norm.  For  simplicity  it  is  denoted  as  | |x| | .  In  association  with 
this  vector  norm,  the  infinite  matrix  norm  is  also  defined.  Thus  we  have, 
for  any  vector  x  and  matrix  A, 

I |x| |  =  max | xi |  , 

1  (2.1) 

J  |A| |  =  max  l  | a.. | . 
i  j  J 

We  next  define  | • |  as  the  result  of  replacing  all  elements  of  the 
argument  by  the  corresponding  absolute  values.  Thus  for  a  scalar  s,  is1 
is  simply  its  magnitude;  for  a  vector  v  =  (v^),  |v|  is  a  vector  with 
elements  |vj;  for  a  matrix  M  =  (m^),  |M|  is  a  matrix  with  elements 
|nUj|.  Furthermore  the  inequality  |A|  £  |B|  implies  |a—|  £  |b  — I  for 
all  i,j.  We  have  the  following  lemma  which  can  easily  be  proved: 

Lemma  2.1.  With  respect  to  the  norms  defined  in  (2.1),  we  have 

(i)  I  lx!!  =  11  lx|  II, 

(ii)  ! |A| |  =  ||  | A |  ||, 

(iii)  | |AB| |  <  | |  | A| • | B|  | | , 

(iv)  | A |  <  |B|  -  | | A | |  <  | |B| | . 


(2.2) 


r 


Now  we  will  only  consider  normalized  floating-point  computations  with 
t  bits  allocated  to  the  mantissa  of  a  floating-point  number.  Given  two 
floating-point  numbers  x,  y,  we  shall  denote  by  f£(x*y)  the  correctly 
rounded  result  of  any  floating-point  operation  *.  For  a  posteriori  error 
analysis,  we  need  the  following  lenma  [1]: 

Lemma  2.2.  Let  *  denote  any  of  the  operators  + ,  - ,  * ,  / .  Then 

(1  +  5)ft(x*y)  =  x*y»  1 6|  _<  2_t  =  u.  (2.3) 

We  see  that  Lenina  2.2  indeed  tells  us  the  a  posteriori  error  (<5}f£(x*y) 
which  is  the  difference  be  tween  the  exact  result  x*v  and  the  computed 
result  R(x*v).  Furthermore  the  bound  for  the  error  can  easily  be 
estimated  for  each  operation.  For  algorithms  with  a  finite  number  of 
these  basic  operations,  the  repeated  use  of  Lemma  2.2  will  enable  us  to 
monitor  the  error  generated  a*  step  of  computation. 

3.  The  triangular  systems. 

Consider  a  triangular  system  of  linear  equations  defined  as 

Lv  =  p  (3.1) 

where  I.  =  («,..)  is  a  non-singular  n-th  order  triangular  matrix  and  p  is 
a  given  n-vector.  Let  us  new  define  L  ^  as  an  n-th  order  matrix  with 
g,  as  its  (s,t)-th  element  and  0  or  1  as  the  off-diagonal  or  diagonal 
element'',  respectively.  Thus  for  a  3  *  3  system,  L^j  and  L^  will  be 


3 


and 


J21 


L,-  = 


[1 

0 

0 


0  o] 


21 


1 

0 

0 

1 

0 


(3.2) 


33 


(3.3) 


respectively.  With  L  defined  above,  we  have  the  following  theorem: 

Theorem  3.1.  For  the  lower  triangular  matrix  L  defined  in  (3.1),  let 
(k) 

Lv  denote  an  n-th  order  identity  matrix  with  its  k-th  row  replaced  by 
the  k-th  row  of  L.  Then  we  have 


LkiLk2  -  Wk  ■  lW'  dd" 

(ii)  •••  L(n)  =  L. 


(3.4) 

(3.5) 


Equations  (3.4)  and  (3.5)  can  easily  be  proved  by  induction.  From  Theorem 
3.1,  we  see  that  solving  (3.1)  is  equivalent  to  solving  a  decomposed 
system 

l(1)l(2)  ...  L(n)v  =  p  (3.6) 

which  can  be  solved  in  the  sequence 
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L 

L 


CDpCD 

C2)p(2) 


(3.7) 


L(n)pCn)  =  p^, 

vpW 

fk)  fk) 

Again  by  Theorem  3.1,  each  of  the  equations  in  (3.7),  say  1/  ^pv  = 
fk-11 

p1-  ,  is  equivalent  to 


hah*  -  *  p™ 


(3.8) 


which  can  also  t  solved  in  a  new  sequence 


P 

Lklp 

Hi* 


00,0 

00,1 

00,2 


-  p»-D 

-PW»° 

-pW'1 


> 


» 


Lkkp 


(k)  ,k- 1 
(k),k< 


(3.9) 


(k)  i 

Expressing  a  specific  equation  of  (3.9)  in  detail,  say  L^p  = 
p(k)»j'l,  1  _<  j  <  k,  we  have 
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Equation  (3.10)  shows  that  the  only  non-trivial  computation  is  that  to 
obtain 


„  OOJ-l.pOOJ-1, 


1  <_  j  <  k. 


(3.11) 


For  j  =  k  we  sinply  have 


6 


r 


Jto.k  .  _(k),k-l. 

Pic  pk  /llkk- 


(3.12) 


Thus  we  have  reduced  the  solution  of  a  general  lower  triangular  system 

to  the  solution  of  a  sequence  cf  decompositions  in  which  at  most  two 

elementary  operations  are  needed  for  each  decomposition.  If  we  define 

s(k),j  _  pOOjj)^  then  computational Iv  (3.11)  and  (3.12)  become 

kj  3 


1  <  j  <  k, 


s(k),j  =  f^c-^pWJ)  ^ 

p(k),j  .  fa(s(k),j  +  p(k),j-l )J 


p£k),k  =  f4(p^k:)  >k"1/^kk)  ■ 


Applying  Lemma  2.2  to  (5.13)  and  (3.14),  we  have 


vf,ij*p(k,>Wk,-Vs(,0-V 


(3.13) 


(3.14) 


pOO  .M 
Pk 


(3.15) 


6kjl ,  l«kjl  1  u,  1  <  j  <  k, 


*lckpkk)  ’k  *  '  Pk11’’''1'  l6kkl  i  t3'16) 


In  matrix  formulation,  we  have 


t  p(k),j  +  e(k),j  .  pck)  *  j'lf  l  <  j  <  k, 


^p(k),k  +  e(k),k  =  p(k),k-i 


(3.17) 


(3.1?) 


fk)  i  fkl  k 

where  e  and  ek  ’  are  n-vectors  whose  only  non- zero  elements  are 
the  k-th  elements 
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oo.j.poo.jj^  i<j<k> 


k  j : 


(3.19) 


and 


e(k),k  _  (k),k. 

ek  W  6kk- 


(3.20) 


Premultiplying  both  sides  of  (3.18)  by  •••  Lk  k  l  and  using  (3.17), 

we  have 


*  £<W  ■  P(k-» 


(3.21) 


where 


=eW.l  +  L,ie(k),2  (k),3+  ...  + 


hihz  *“  hc.k-l6 


+  hclh^ 

(k),k 


(3.22) 


Now 


the  only  effect  of  premultiplying  with  e^fi  is  to  add  an 


additional  term  JL  to  the  k-th  element  of  e^’-**  c’""  o(k)»i 


kj  j 


;  since  e: 

3 


is  zero  for  j  /  k,  hence  we  have 

i  „  (k) ,  j  _  (k) ,  j  .  ,  . 

Lkie  ”  e  >  1  r  3- 


(3.23) 


Applying  (3.23)  to  (3.22),  we  have 


e(k)  =  y  e®*1. 

i=l 


(3.24) 


(kl  (kl 

Furthermore,  the  only  non- zero  element  of  ’  is  the  k-th  element  ek  ’ 


which  is  equal  to 
k-1 


(k)  .  (p<k).jSk.  .  5»).iS..]  .  (3.25) 


1 


¥ 


Equation  (3.21)  can  also  be  expressed  as 


L(k)p(k)  +  £(k)  =p(k-D<  (3>26) 

Extending  (3.26)  to  k  =  1,  2,  •••,  n  and  combining  these  equations,  we 
have 

l(!)l(2)  ...  L(n)y  +  e  ,  p  (3,27) 

where 

e.  e(D  +  ♦ 

(3.28) 

l(Dl(2)  ..  L(n-D£(n)i 


Again  we  have 

l0)€«)  .  Ji)_  j  <  j.,,  (3.29) 

since  the  first  i-1  elements  of  are  tore.  Hence  (5.28)  simplifies 
to 

n 


e-  l  - 

i=l 


(i) 


(3.30) 


Xow  if  we  define 


p  =  max 
P  k,  j 


xjjpOO'Sl.  |s«-j|]. 


l<k<n,  1  <  j  <  k,  (3.31) 


and 


aL  =  maxl^l 
k 


(3.32) 
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1 

j 


Then  an  upper  bound  for  the  k-th  element  of  e,  or  e£KJ  in  (3.25),  can  be 
estimated  as 

|e^l  1  [ 2 (k-1)  +  oL]ppu,  1  <  k  £  n.  (3.33) 

Thus  we  have  proved  the  following  theorem: 

Theorem  5.2.  In  solving  the  triangular  system  of  equations  (3.1), 
the  solution  v  computed  by  the  sequential  de compos it ions  of  p  satisfies 
the  equation 


Lv  +  e  =  p 


where  e  is  defined  by  (3.30);  furthermore 


e  < 


2(0)  +  oL 
2(1)  ♦  aL 
2(2)  +  aL 


2(n-l)  +  a. 


P  u, 

P 


|e|  |  <  1 1  ) e)  1 1  <  [2(n-l)  +  ojp  u. 


(3.34) 


(3.35) 


(3.36) 


Now  we  observe  that  (3.8)  can  also  be  written  as 


whose  solutions  are  easily  obtained  as 


U 


1 

*Tck 


k-1 

& 


00 


* 


and 


(k-1) 

j 


j  f  k. 


(3.38) 
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J 


The  algorithm  expressed  in  equation  (3.38)  is  exactly  the  substitution 
algorithm  used  in  Gaussian  elimination  to  solve  the  decomposed  triangular 
systems.  Furthermore,  if  the  inner  product  in  (3.38)  is  evaluated  first, 
then  the  computations  are  executed  in  exactly  the  same  sequence  as  that 
in  (3.9).  Thus  computationally  the  decomposition  algorithm  expressed  in 
(3.7)  and  (3.9)  are  equivalent  to  the  conventional  substitution  algorithm. 
However,  if  we  follow  the  usual  backward  error  analysis,  the  computed  v 
can  be  shown  [1]  to  satisfy: 

(L  +  AL) v  =  p  (3.39) 

where 

|  |  AL 1 1  <  n(n+l)  max|Jl.  .  |u.  (3.40) 

i»j  J 

Comparing  (3.36)  and  (3.40),  we  have  the  following  comments: 

(a)  The  bound  for  e  in  (3.36)  is  a  function  of  a^,  pp  and  n  in  which 

p  and  a,  are  relatively  stationary  for  computations  with  sufficient 
p  L 

precision.  Hence  if  the  system  order  n  is  large,  the  bound  is  propor- 

2 

tional  to  n.  However,  we  see  | |AL| |  is  proportional  to  n  for  large  n. 
Since  these  bounds  are  used  to  bound  the  relative  error  between  the 
computed  solution  and  exact  solution,  (3.40)  is  an  overestimation  when 
compared  with  practical  results. 

(b)  Computationally ,  using  (3.36)  is  not  only  practical  as  it 
enables  us  to  monitor  the  round-off  error  step  by  step,  but  it  is  also 
realistic  as  it  depends  on  both  matrix  L  and  the  n-vector  p.  For  example, 
if  n  =  1  and  p  =  0,  then  it  is  obvious  that  | |e| |  <  0  and  this  is  what 
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happens  in  actual  computation.  On  the  other  hand,  (3.40)  depends  only 
on  the  matrix  L,  hence  intuitively  and  computationally  it  is  a  "static" 
overestimation  with  very  little  information  regarding  what  actually 
happens  in  the  process  of  amputation. 

4.  The  general  systems. 

Now  we  can  consider  solving  a  general  system  of  linear  equations 
defined  as 

Az  =  p  (4.1) 

where  A  is  an  n-th  order  non-singular  matrix  and  p  is  an  n- vector.  It 
is  rather  trivial  to  show  that  by  properly  interchanging  rows  or  columns, 
the  permuted  A,  for  simplicity  we  will  still  call  it  A,  can  be  decomposed 
into  a  product  of  L,  and  U  such  that  A  *  LU  where  L  is  a  unit  triangular 
matrix  and  U  is  an  upper  triangular  matrix.  The  usual  row-pivoting 
strategy  makes  the  decomposition  possible  by  proper  row  interchanges. 

We  will  consider  the  partial  row-pivoting  strategy  in  which  a  rcw  is 
chosen  as  pivoting  rcw  such  that  it  has  the  largest  magnitude  coefficient 
for  the  variable  to  be  eliminated.  We  will  also  assume  that  row  permu¬ 
tations  are  done  in  advance  so  that  no  pivoting  is  necessary. 

Now  the  decomposition  consists  of  conputing  a  sequence  of  matrices 
A^  =  A,  A^\  •••,  A^n\  where  the  matrix  A^  is  zero  belcw  the 
diagonal  in  the  first  k-1  columns.  The  matrix  A^+1-*  is  obtained  from 
A^  by  subtracting  a  multiple  of  the  k-th  row  from  each  of  the  rows 
below  it;  the  rest  of  A^  is  not  changed.  The  multipliers  are  chosen 


13 


W 


would  have  zeros  be lew 


so  that  if  there  were  no  round-off  errors, 

the  diagonal  in  the  k-th  column.  We  do  not  calculate  these  elements  but 

take  them  to  be  zero  by  definition.  More  precisely,  let  Av  have  elements 

aPP .  Then  let 
ij 


mik  ■ 


lk 


(k) 


k+l  <  i. 


(4.2) 


and 


) 


(k+l) 

aij  " 


fa 


af«  - 

ij 


j  =  k,  k+l  £  i, 
k+l  £  j,  k+l  £  i, 
otherwise. 


(4.3) 


These  steps  are  carried  out  for  k  =  1,  2,  n-1.  Finally,  let 


U  = 


(4.4) 


and 


"l 

m21  1 
m_1  m32  * 

L  ®  • 

•  •  • 

%1  *V2  •  •  \ 

To  compute  (4.3),  let  us  further  define 


s 


0P  =  ft 

IJ 


k+l  <  j,  k+l  £  i. 


(4-5) 


(4.6) 
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So  we  have 


(0  ,  j  =  k,  k+1  £  i, 


J^a^j  otherwise. 

■Applying  Lenina  2.2  to  (4.2),  (4.6),  and  (4.7),  we  have 


U  *  6lk)mik  - 

k+1 

1  i, 

(4.8) 

a  ♦  «tj)Stw  - 

■"ikakj)- 

k+1 

<  j,  k+1  _<  i. 

(4.9) 

a(k+1)  =  0 
aik  0 

> 

k+1 

£  i> 

f *.l0 ) 

(1  +  l,.'.)a^k+1) 
U  iJ 

-  aPP  +  SPP 

ij 

> 

k+1  £  j ,  k+1  £  i, 

(4.11) 

-  aP° 

U  U 

otherwise. 

(4.17) 

Combining  (4.8)  and  (4.10),  we  have 

a(k*l)  =  0  =  _(k)  .  (k)  .  (k)&  k+,  <  . 

aik  0  aik  mikakk  mikakk  6ik»  k+1  <  i. 


Combining  (4.9)  and  (4.11),  we  obtain 


ij  i]  ikn<j  ij  ij 


(4.13) 


ij  ij  ik  kj  ij  ij  ij  ij’ 

(4.14) 

k+1  £  j ,  k+1  £  i. 

In  matrix  notation  (4.13),  (4.14),  and  (4.12)  are  conbined  to  give 

A(k+D  s  A(k)  _  L(k)A(k)  _  E(k)  (4 .15) 

IS 


where  Li 


:  00  _ 


with 


ePP 


mikakk^ik 

+^k)S..  +  a.(k+1)<5!., 
iJ  ij  ij  ij* 


k+1  <  i,  j  =  k, 
k+1  _<  i,  k+1  _<  j , 
otherwise, 


and 


(k) 


0 

0 


0 

Vl.k 

"*k+2,k 


m 


nk 


0 


Adding  (4. IS)  for  k  =  1,  2,  n-1,  we  have 

AC")  *  "j*  L»)ACfcJ  _  A_  "j1  E*>. 
k=l  k=l 

Since  the  matrix  i/kWk^  depends  only  upon  the  k-th  row  of  . 
which  is  equal  to  the  k-th  row  of  A^ ,  we  thus  have 


(4.16) 


(4.17) 


(4.18) 

(k) 
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\l+  l  L™  A(n)  =  A  -  j  E(k), 


(4.19) 


That  is, 

LU  =  A  -  T 

where  L  and  U  are  defined  by  (4.5)  and  (4.4)  and  where 

k=l 


To  bound  E  we  observe  that  since  nr  ^  the  use  of  pivoting 

implies  that  Kki  i 1  for  all  i,k.  Furthermore  from  (4.6)  we  have 


(4.20) 


(4.21) 


I j"*  I  i  I  akj }  I  *  k+1  <  j ,  k+1  <  i . 


(4.22) 


Hence  if  we  define 


o  =  max  |s{M|.  |a^|"|  =  max  fia[?l  » 
i, j  ,k  -  J  i,j,kL  J  J 


(4.23) 


then  from  (4.16)  we  have 


au  , 

j  =  k,  k+1  £  i, 

2au, 

k-+l  <_  i,  k+1  £  j, 

0 

otherwise . 

(4.24) 


Following  (4.21),  we  add  the  c jr  together  to  get  E.  And  we  have 
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2 

From  (4.25)  we  also  have  ||  |E|  | |  _<  (1+3+5+* • •+2n-3+2(n-l))au  =  (n  -l)ou. 
Thus  we  have  proved  the  following  theorem: 

Theorem  4.1.  The  matrices  L  and  U  computed  by  Gaussian  elimination 
with  row-pivoting,  using  floating-point  arithmetic,  satisfy 

LU  +  E  =  A.  (4.26) 

Furthermore , 

| |E| |  <  | |  | E |  | |  <  (n2-l)au.  (4.27^ 

Once  the  matrix  A  has  been  decomposed,  the  results  in  section  3  can 
therefore  be  used  to  solve  the  deconposed  triangular  systems.  Thus  after 
decomposition  we  have 

LU  +  E  =  A.  (4.28) 

Now  in  solving  LUz  =  p  in  the  sequence  Lv  =  p  and  Uz  =  v  by  substitution 
algorithm,  Theorem  3.2  tells  us  that  the  computed  v  and  z  satisfy 

Lv  +  =  p,  (4.29) 
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and 


Uz  +  =  v 


(4.30) 


where 


I e! !  £ 


L 

2+aL 

2(2)+o. 


2(n-l)+o, 


P  u. 
P  ’ 


|e2l  £ 


2(n-l)+a1J 


2(2' +au 


2+a, 


U 


Pvu. 


(4.31) 


Combining  (4.28),  (4.29),  and  (4.30),  we  have 


(A  -  E)z  =  p  -  e^  -  Le2- 


(4.32) 


Thus  the  computed  z  satisfies  a  new  system  with  perturbed  A  and  perturbed 
p.  Let  6A  =  -E,  Sp  =  -e^  -  I^,  the  bound  for  6/  is  estimated  as 

| |6A|  |  <  (n2-l)ou.  (4.33) 

Applying  Lenina  2.1  to  6p,  we  have 

I |6p| |  <  ||e1||  +  ||  |L|-|e2l  || 

£  [2(n-l)  +  PL]ppu  +  [n  -  n  +  no^pyU.  (4.54) 
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We  should  note  that  =  1  and  <_  a  from  the  definition  of  L  and  U. 
Furthermore,  if  we  denote  p  =  maxlp^.p^.],  then  (4.34)  can  be  simplified 
as 

|  1 6p |  |  _<  (n2  +  n  -  1  +  na)pu  (4.35) 

where 

p  =  max[p  ,PV].  (4.36) 

Thus  we  have  proved  the  following  theorem: 

Theorem  4.2.  The  solution  z  computed  by  Gaussian  elimination  with 
row-pivoting  and  substitution  satisfies  the  equation 

(A  +  6A)z  =  p  +  6p  (4.37) 

where  6A  =  -E  and  6p  =  -e^  -  I^.  Furthermore, 

| | 6A| |  <  (n2  -  l)au,  (4.38) 

I  I'SpI  |  ^_  (n2  +  n  -  1  +  no)pu.  (4.39) 

We  observe  that  6A  is  essentially  in  the  same  format  as  the  pertur¬ 
bation  matrix  obtained  by  Forsythe  and  Moler  [1]  in  the  decomposition  of 
A.  This  is  no  surprise  to  us  since  they  have  also  used  Lemma  2.2  in  part 
of  their  analysis.  However,  the  overall  result  is  different.  In  fact, 
their  result  shews  [1]  that  the  computed  z  satisfies 

(A  +  AA)  z  =  p  (4.40) 
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where 


|  |AA|  I  <  1.01(n3  +  3n2)ou.  (4.41) 

The  upper  bound  for  AA  in  (4.41)  is  therefore  proportional  to  n  for 
large  system  order  n.  The  comments  at  the  end  of  section  3  also  apply 
here. 

We  further  note  that  the  factor  n3  in  (4.41)  is  due  to  the  solution 

of  the  deconposed  triangular  systems.  Hence  if  we  use  higher  precision 

to  solve  the  deconposed  systems,  this  term  should  be  reduced  drastically 

and  hence  we  should  expect  to  have  much  more  accurate  results.  However, 

this  is  not  true  in  practical  observations.  Indeed,  if  the  decomposition 

is  already  in  error,  the  improvement  to  solution  accuracy  using  high 

precision  arithmetic  in  solving  the  triangular  systems  is  very  little, 

if  not  naught.  The  reason  can  be  explained  by  the  results  of  our  analysis. 

We  see  that  the  perturbations  due  to  decomposition  of  A  is  6A  and  the 

perturbations  due  to  the  solution  of  triangular  systems  is  6p.  The  upper 

bounds  for  5A  and  6p,  shown  in  (4.38)  and  (4.39),  show  that  they  are  of 
2 

the  same  order  n  for  large  n.  So  unless  higher  precision  arithmetic  is 
used  for  both  the  decomposition  of  A  and  the  decomposition  of  p,  there  is 
very  little  gain  in  using  higher  precision  arithmetic  in  only  one  process. 

5.  Conclusions . 

We  have  shewn  by  using  a  posteriori  error  analysis  that  the  pertur¬ 
bations  due  to  decomposition  process  and  due  to  solution  of  triangular 

7 

matrices  are  of  the  same  order  n  for  large  n.  This  approach  of 


I 


attributing  generated  errors  to  both  matrices  A  and  p  is  intuitively  and 
computationally  natural.  In  fact;  the  decomposed  L  and  U  are  kept  in 
computer  memory  and  are  not  perturbed  in  solving  the  triangular  systems. 
Hence  the  perturbations  in  the  solution  of  triangular  systems  should  be 
attributed  to  the  vector  p  which  is  actually  perturbed.  There  is  of 
course  another  advantage  of  using  a  posteriori  error  analysis:  that  the 
"dynamic"  behavior  of  the  computational  process  can  be  monitored  step  by 
step. 

We  should  also  note  that  the  "efficient"  Gaussian  process  is  essen¬ 
tially  an  "analytic"  process  [3].  In  other  words,  this  algorithm  tries 
to  decompose  p  such  that  Az  =  p  for  given  A,  p.  Algebraically  z  is 
unique  whether  it  is  obtained  by  satisfying  Az  =  p  or  by  directly  eval¬ 
uating  z  =  A  *p.  However,  computationally  the  closeness  of  Az  to  p  does 
not  guarantee  the  closeness  of  z  to  A_1p.  Hence  the  results  of  the  a 
posteriori  error  analysis  can  only  tell  us  the  difference  between  the 
computed  decomposition  LU  and  the  exact  decomposition  A  or  the  difference 
between  the  computed  decomposition  LUz  and  the  "exact"  decomposition  p. 

In  order  to  find  the  difference  between  computed  solution  z  and  the 
exact  solution  A  ^p,  we  need  to  knew  A  *  whose  information  has  been 
inadvertently  by-passed  in  the  Gaussian  process.  Therefore  "efficient" 
algorithms  are  not  necessarily  "good"  algorithms  in  other  respects. 
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